General Notebook

Author: Andrew D. Nguyen

Affiliation: Biology Department, University of Vermont

Contact: anbe642@gmail.com

Date started: 2016-05-13

Date end (last modified):

Introduction:
I wish I started an online notebook earlier, but maybe it's not too late? Anyway, I'll use this doc to share my ideas and log the progress of my dissertation.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Table of Contents (Layout follows Page number: Date. Title of entry)

Results from 2016-05-16 ML tree using RaxML black box on CIPRES. ### Transformed branch lengths
### Untransformed branch lengths
Notes: I left a pogo sample in there. LPR4 and HW5 look switched.
###Summary of tree by species:
When comparing with the NJ tree, the placement of A. picea is different. * ML tree: A. picea is sister to A. rudis,A. miamiana, A. lamellidens * NJ tree: A. picea is sister to A. rudis,A. miamiana, A. lamellidens,A. ashmeadi, A. floridana
##Rerunning analysis without PB07-23 to double check this sample doesnt skew ingroup relationships.

Page 7: 2016-05-17. ABI steponeplus machine maintenance.

Machine Problem: It freezes mid run without giving an error, even while operating stand alone. Sometimes when it freezes, the door wont release plate. And it also has trouble connecting to laptop even after restart.

Machine Info: ABI steponeplus
1. serial #: 272007769
2. ref: 4376592
3. University #: A92219

Under contract, no cost.
Contact info:
* Jeremy, 1-800-955-6288 option 3, then option 1
* issue#: 405638599

They need to send to Indonesia for repair. 1 month eta.

20160519 update: tracking number for box (for us to put machine in and send to them)- 6506 8693 8148

Also: > *Hi Andrew,
You should receive a Loaner within 2-3 business days.
Thanks,
Foi Taua

Didn't know we were getting a loaner. He didn't mention cost.

20160520 update: Machine sent out

Previous analyses concatenate SNPS, but many studies use whole rad loci.
Computer cluster: Reference for mason cluster
path of raw ddrad data > /N/dc2/scratch/scahan/Andrew_RADseq_051516/ Data/
SHC email: > If you want to explore/analyze the RADseq data yourself: /N/dc2/scratch/scahan/Aphaenogaster_RADfiles_051516/ You should find in each lane directory the raw .fq file from the sequencer, a barcode key file, the demultiplexed sample .fq files, and the trimming, filtering and mapping files from the pipeline. The earliest lanes (1&2) might have fewer files because the process was not yet regularized back then. The STACKs portion of the pipeline is specific to each project, so they all have their own directories in the main scratch space (e.g., Andrew_RADseq_051516, Bernice_051516, Phytotron_analyses_051516, etc.). All directories at this level have their date suffix modified every two weeks, so job scripts that point to a particular path have to get edited to the current date suffix. Some of the ddRAD lane directories also have a date suffix because they were secondarily moved from the main level into the Aphaenogaster directory.
### Trying pyRAD tutorial. Looks "easy". No access to dependencies: 1. scipy 2. vsearch 3. muscle
20160520 update, working on Mason compute cluster: > Hi Andrew, First, I'd suggest you add "module load python" to your ~/.modules file, which will load the python 2.7.3 module each time you login. It's not terribly current, but it is the version under which we install python packages on Mason. You'll find that numpy and scipy are both available there. As for muscle and vsearch, I'll let you know when we get those packages installed. Matt
### I could use the population function/module in stacks.

Page 13: 2016-05-20. Evolution of proteome stability project

We are interested in the adaptive variation in how proteins unfold between 2 different ant species. Github repo

We isolated native proteins, subjected them to temperature treatments for 10 min. Then ultracentrifuged to pull down aggregates, then quantified. protocols here

Figure:

Function I am fitting to these points:

$min + \frac{1-min}{(1+e^{(-slope(Tm-Temp))})}$

Code for curve fitting, also loading libraries

library(plyr)
library(ggplot2)
library(tidyr)
library(minpack.lm)

nls.fit<-function(data=data){
  y<-nlsLM(unfolding ~ min+ (1-min)/(1+exp((-slope*(Tm-T)))),data=data, 
           start=list(slope=.5,Tm=45,min=.3),
           trace=TRUE,control=nls.control(warnOnly = TRUE, tol = 1e-05, maxiter=1000))
  #return(y)
  return(summary(y)$coefficients)
  }

function to visualize curves by simply putting in paramters


fud<-function(T=seq(25,50,1),Tm=40,slope=.5,max=1,min=0){
  y<-min+ (max-min)/(1+exp((-slope*(Tm-T))))
  return(y)
  }

How I implemented th code:

mod1<-ddply(x.par,.(Species,Colony),nls.fit)
mod1$parameter<-rep(c("slope","Tm","min"),length(mod1$Species)/3)
knitr::kable(mod1)

Table summary of results from fitting curves.

Species Colony Estimate Std. Error t value Pr(>|t|) parameter
A. rudis Duke 1 0.1606280 0.0206403 7.782238 0.0000276 slope
A. rudis Duke 1 47.2920297 0.9451544 50.036301 0.0000000 Tm
A. rudis Duke 1 0.3637620 0.0293990 12.373285 0.0000006 min
A. rudis Lex 13 0.1333902 0.0159832 8.345673 0.0000158 slope
A. rudis Lex 13 49.7593929 1.2760137 38.995972 0.0000000 Tm
A. rudis Lex 13 0.2161279 0.0451703 4.784737 0.0009947 min
A. rudis Yates 2 0.1573466 0.0220329 7.141430 0.0000542 slope
A. rudis Yates 2 47.9849648 1.0899761 44.023870 0.0000000 Tm
A. rudis Yates 2 0.3637813 0.0336777 10.801853 0.0000019 min
P. barbatus WWRQ-45 0.2142567 0.0165774 12.924625 0.0000004 slope
P. barbatus WWRQ-45 45.9987927 0.3837543 119.865208 0.0000000 Tm
P. barbatus WWRQ-45 0.4032438 0.0126671 31.834069 0.0000000 min
P. barbatus WWRQ-53 0.1823480 0.0173963 10.482009 0.0000024 slope
P. barbatus WWRQ-53 47.2858982 0.5958843 79.354167 0.0000000 Tm
P. barbatus WWRQ-53 0.4013122 0.0184886 21.705927 0.0000000 min
P. barbatus WWRQ-8 0.2028211 0.0245990 8.245113 0.0000174 slope
P. barbatus WWRQ-8 45.5664742 0.6340253 71.868543 0.0000000 Tm
P. barbatus WWRQ-8 0.4280916 0.0194756 21.980921 0.0000000 min

Only slope was significant

summary(aov(Estimate~Species,data=slope))
            Df   Sum Sq  Mean Sq F value Pr(>F)  
Species      1 0.003654 0.003654   15.15 0.0177 *
Residuals    4 0.000965 0.000241                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Figure of unfolding if only changing slope (eye balled mean slope, so pogo = .2,rudis=.15 )

Sharing screenshots of sanger sequenced samples mapped to reference transcript (P. barbatus) * I used the software Geneious v6 to analyze sequence data. * Sample structure on figure: Well_colony.id_gene_primer# * The pics and raw sequence data can be found: here * Path: /Dissertation_temperature_adaptation_ants/Dissertation_Projects/2014_xanbe-common-garden_gxp_evolution/Data/sequencing/Sanger/
1. hsc70-4 h2 1468F + 1592R
2. hsp83 278F+392R
3. hsp83 1583F + 1682 R
4. hsp40 541F+ 641R
Summary of results: Most of samples mapped really well! Generally, the sequencing with the forward primer recovers the reverse primer, and vice versa.

Page 17: 2016-05-25. Double check samples for SHC; JSG phytotron exp and MS.

email sent 2016-05-18:

Ok – your list is missing 20-B (AP2), which is on your tree. The two samples with no morphological ID are your RW2 (25-C) and your BP2 (07-B), which will have to get omitted. The only remaining samples whose placements are problematic are your RW1, which was ID’d picea but comes out in that odd basal clade with the intermediate NK samples, and LA4, which was ID’d as rudis but falls out in the middle of picea. Looking at the latter one, however, this is the mysterious LVA9, which was written down as the source for two different experimental colonies (not possible, since they were supposed to be queenright) and there is no way to know if the sample Bernice looked at is the same as the RADseq sample or the assayed colony. So there is good reason to throw that one out as well.

Need to double check these samples.

2016-05-26 UPdate- Ecluding:

  1. 25-C / RW2
  2. 07-B / BP2
  3. 10-F / LA4
Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3959 on 34 degrees of freedom Multiple R-squared: 0.908, Adjusted R-squared: 0.8971 F-statistic: 83.85 on 4 and 34 DF, p-value: < 2.2e-16 ``` Looksl ike first 3 axes are significant: choose these in regression models with Tmax
### Check correlation between bioclim variables and phylogenetic components
| | Axis.1| Axis.2| Axis.3| Axis.4| bio5| bio6| bio7| mergnb∣∣ :  −  −  −  −  −  −  − ∣ −  −  −  −  −  −  : ∣ −  −  −  −  −  −  : ∣ −  −  −  −  −  −  : ∣ −  −  −  −  −  −  : ∣ −  −  −  −  −  −  : ∣ −  −  −  −  −  −  : ∣ −  −  −  −  −  −  : ∣ −  −  −  −  −  −  −  : ∣∣Axis.1∣1.000∣0.000∣0.000∣0.000∣0.882∣0.745∣ − 0.454∣ − 0.258∣∣Axis.2∣0.000∣1.000∣0.000∣0.000∣0.159∣0.139∣ − 0.089∣0.023∣∣Axis.3∣0.000∣0.000∣1.000∣0.000∣0.151∣0.301∣ − 0.327∣ − 0.321∣∣Axis.4∣0.000∣0.000∣0.000∣1.000∣ − 0.044∣ − 0.090∣0.099∣0.072∣∣bio5∣0.882∣0.159∣0.151∣ − 0.044∣1.000∣0.772∣ − 0.411∣ − 0.412∣∣bio6∣0.745∣0.139∣0.301∣ − 0.090∣0.772∣1.000∣ − 0.897∣ − 0.728∣∣bio7∣ − 0.454∣ − 0.089∣ − 0.327∣0.099∣ − 0.411∣ − 0.897∣1.000∣0.757∣∣mergnb | -0.258| 0.023| -0.321| 0.072| -0.412| -0.728| 0.757| 1.000|
### Model subsets
Construct full model to test interaction between Tma and each eigenvector(part of phylogeny
```R #Ctmax = upper thermal limit # Axis1 - picea rudis split # Axis2 - N-S rudis clade split # Axis 3 - Pica split # Rearing temp: 20(23?) and 26 #Bio 5 = Tmax
lm(Ctmax~bio5Axis.1+bio5Axis.2+bio5*Axis.3+Rearing.temp,data=merg) ```
Showing table of model subsets generated from dredge() function
| | (Intercept)| Axis.1| Axis.2| Axis.3| bio5| Rearing.temp| Axis.1:bio5| Axis.2:bio5| Axis.3:bio5| df| logLik| AICc| delta| weight| |:---|-----------:|---------:|----------:|-----------:|---------:|------------:|-----------:|-----------:|-----------:|--:|----------:|--------:|---------:|---------:| |112 | 38.89633| -64.54042| 163.379439| -7.1344145| 0.1142023| NA| 2.536627| -5.254042| NA| 8| -7.240543| 35.28109| 0.000000| 0.4265037| |42 | 40.27150| -32.80201| NA| NA| 0.0688105| NA| 1.425184| NA| NA| 5| -13.003298| 37.82478| 2.543691| 0.1195549| |128 | 38.47028| -66.64409| 164.835542| -7.4660612| 0.1213695| 0.0095824| 2.601996| -5.309316| NA| 9| -7.070997| 38.34889| 3.067805| 0.0919936| |240 | 38.99929| -63.44634| 165.175057| -17.1348640| 0.1105860| NA| 2.500507| -5.300206| 0.3704068| 9| -7.230877| 38.66865| 3.387565| 0.0784011| |44 | 39.58611| -45.63541| -2.113326| NA| 0.0908785| NA| 1.833044| NA| NA| 6| -12.197967| 39.02093| 3.739847| 0.0657393| |108 | 40.67312| -34.07965| 68.733204| NA| 0.0549570| NA| 1.516668| -2.192978| NA| 7| -10.725474| 39.06385| 3.782765| 0.0643437| |46 | 40.41404| -31.25480| NA| 0.5303359| 0.0639742| NA| 1.377435| NA| NA| 6| -12.955310| 40.53562| 5.254534| 0.0308259| |58 | 40.27478| -32.80931| NA| NA| 0.0687900| -0.0001219| 1.425440| NA| NA| 6| -13.003275| 40.63155| 5.350465| 0.0293822| |48 | 39.02228| -53.61002| -2.839872| -1.2211435| 0.1096013| NA| 2.083210| NA| NA| 7| -12.029592| 41.67209| 6.391001| 0.0174636| |60 | 39.47928| -45.71692| -2.160369| NA| 0.0919412| 0.0034085| 1.834965| NA| NA| 7| -12.180302| 41.97351| 6.692423| 0.0150204| |256 | 38.58207| -65.43836| 166.850595| -18.6312806| 0.1173856| 0.0096530| 2.562160| -5.361253| 0.4134582| 10| -7.058857| 41.97486| 6.693772| 0.0150103| |124 | 40.68974| -34.04638| 68.875613| NA| 0.0547434| -0.0004636| 1.515799| -2.197188| NA| 8| -10.725127| 42.25025| 6.969168| 0.0130794| |174 | 40.17971| -36.89376| NA| -39.0161170| 0.0707023| NA| 1.545609| NA| 1.4424428| 7| -12.647902| 42.90871| 7.627622| 0.0094104| |62 | 40.43434| -31.27748| NA| 0.5367575| 0.0637997| -0.0006914| 1.378308| NA| NA| 7| -12.954602| 43.52211| 8.241021| 0.0069248| |176 | 38.62972| -58.09940| -4.104072| 36.3065961| 0.1233954| NA| 2.234488| NA| -1.3972498| 8| -11.917041| 44.63408| 9.352996| 0.0039714| |64 | 38.74224| -54.92127| -3.032774| -1.3987999| 0.1142952| 0.0063182| 2.123167| NA| NA| 8| -11.971892| 44.74378| 9.462699| 0.0037594| |76 | 42.07767| 13.63416| 119.882620| NA| 0.0151404| NA| NA| -3.677760| NA| 6| -15.148179| 44.92136| 9.640273| 0.0034400| |8 | 42.43692| 11.87677| 2.870941| 3.7234291| NA| NA| NA| NA| NA| 5| -16.905003| 45.62819| 10.347102| 0.0024159| |190 | 40.09493| -37.03074| NA| -40.5928212| 0.0716161| 0.0025740| 1.548960| NA| 1.4990805| 8| -12.638411| 46.07682| 10.795736| 0.0019305| |6 | 42.43692| 11.87677| NA| 3.7234291| NA| NA| NA| NA| NA| 4| -19.294824| 47.76612| 12.485033| 0.0008295|
###Cumulative AIC weights
# 2016-06-01 continued : Actually model averaging
###top 2 AIC ```R >summary(model.avg(a.max[1:2]))
Full model-averaged coefficients (with shrinkage): Estimate Std. Error Adjusted SE z value Pr(>|z|) (Intercept) 39.19741 1.81718 1.88065 20.842 < 2e-16 Axis.1 -57.59157 22.32729 22.90080 2.515 0.01191 Axis.2 127.60890 83.60797 84.76171 1.506 0.13220 Axis.3 -5.57240 3.87984 3.94483 1.413 0.15778 bio5 0.10426 0.06385 0.06611 1.577 0.11477 Axis.1:bio5 2.29329 0.74450 0.76259 3.007 0.00264 Axis.2:bio5 -4.10371 2.67227 2.70829 1.515 0.12971

Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1

Relative variable importance: Axis.1 bio5 Axis.1:bio5 Axis.2 Axis.3 Axis.2:bio5 Importance: 1.00 1.00 1.00 0.78 0.78 0.78
N containing models: 2 2 2 1 1 1
```

top 6 AIC

>summary(model.avg(a.max[1:6]))
Full model-averaged coefficients (with shrinkage): 
               Estimate Std. Error Adjusted SE z value Pr(>|z|)    
(Intercept)   39.242396   1.895875    1.960765  20.014  < 2e-16 ***
Axis.1       -56.401968  22.642086   23.231252   2.428  0.01519 *  
Axis.2       120.584643  84.389355   85.517189   1.410  0.15852    
Axis.3        -5.992746  25.128163   26.111410   0.230  0.81848    
bio5           0.101921   0.065747    0.068039   1.498  0.13414    
Axis.1:bio5    2.251255   0.751716    0.770321   2.922  0.00347 ** 
Axis.2:bio5   -3.881627   2.688644    2.723787   1.425  0.15413    
Rearing.temp   0.001041   0.006764    0.006986   0.149  0.88151    
Axis.3:bio5    0.034305   0.915567    0.952226   0.036  0.97126    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Relative variable importance: 
                     Axis.1 bio5 Axis.1:bio5 Axis.2 Axis.2:bio5 Axis.3
Importance:          1.00   1.00 1.00        0.86   0.78        0.71  
N containing models:    6      6    6           5      4           3  
                     Rearing.temp Axis.3:bio5
Importance:          0.11         0.09       
N containing models:    1            1       

top 10 AIC

>summary(model.avg(a.max[1:10])) 

Full model-averaged coefficients (with shrinkage): 
               Estimate Std. Error Adjusted SE z value Pr(>|z|)    
(Intercept)   3.931e+01  1.901e+00   1.965e+00  20.004  < 2e-16 ***
Axis.1       -5.462e+01  2.281e+01   2.337e+01   2.337  0.01945 *  
Axis.2        1.086e+02  8.793e+01   8.891e+01   1.221  0.22190    
Axis.3       -5.407e+00  2.393e+01   2.486e+01   0.218  0.82782    
bio5          9.962e-02  6.590e-02   6.817e-02   1.461  0.14391    
Axis.1:bio5   2.187e+00  7.598e-01   7.775e-01   2.813  0.00491 ** 
Axis.2:bio5  -3.499e+00  2.803e+00   2.833e+00   1.235  0.21689    
Rearing.temp  9.893e-04  7.724e-03   7.987e-03   0.124  0.90143    
Axis.3:bio5   3.092e-02  8.693e-01   9.041e-01   0.034  0.97272    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Relative variable importance: 
                     Axis.1 bio5 Axis.1:bio5 Axis.2 Axis.2:bio5 Axis.3
Importance:          1.00   1.00 1.00        0.81   0.70        0.69  
N containing models:   10     10   10           7      4           5  
                     Rearing.temp Axis.3:bio5
Importance:          0.15         0.08       
N containing models:    3            1     

top 4 delta AIC

>summary(model.avg(a.max, subset = delta < 4))

Full model-averaged coefficients (with shrinkage): 
               Estimate Std. Error Adjusted SE z value Pr(>|z|)    
(Intercept)   39.242396   1.895875    1.960765  20.014  < 2e-16 ***
Axis.1       -56.401968  22.642086   23.231252   2.428  0.01519 *  
Axis.2       120.584643  84.389355   85.517189   1.410  0.15852    
Axis.3        -5.992746  25.128163   26.111410   0.230  0.81848    
bio5           0.101921   0.065747    0.068039   1.498  0.13414    
Axis.1:bio5    2.251255   0.751716    0.770321   2.922  0.00347 ** 
Axis.2:bio5   -3.881627   2.688644    2.723787   1.425  0.15413    
Rearing.temp   0.001041   0.006764    0.006986   0.149  0.88151    
Axis.3:bio5    0.034305   0.915567    0.952226   0.036  0.97126    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Relative variable importance: 
                     Axis.1 bio5 Axis.1:bio5 Axis.2 Axis.2:bio5 Axis.3
Importance:          1.00   1.00 1.00        0.86   0.78        0.71  
N containing models:    6      6    6           5      4           3  
                     Rearing.temp Axis.3:bio5
Importance:          0.11         0.09       
N containing models:    1            1       

Comparing output to stepwise AIC both directions

> summary(stepAIC(full.max,direction="both"))
Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  38.89633    1.77085  21.965  < 2e-16 ***
bio5          0.11420    0.06223   1.835 0.075805 .  
Axis.1      -64.54042   19.60370  -3.292 0.002429 ** 
Axis.2      163.37944   55.72789   2.932 0.006179 ** 
Axis.3       -7.13441    2.85109  -2.502 0.017640 *  
bio5:Axis.1   2.53663    0.63426   3.999 0.000351 ***
bio5:Axis.2  -5.25404    1.76036  -2.985 0.005402 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3216 on 32 degrees of freedom
Multiple R-squared:  0.9428,    Adjusted R-squared:  0.9321 
F-statistic: 87.96 on 6 and 32 DF,  p-value: < 2.2e-16

SHC suggestion: Just include all phylo axes in all analyses

full.max<-lm(Ctmax~bio5*Axis.1+bio5*Axis.2+bio5*Axis.3+bio5*Axis.4+Rearing.temp,data=merg)

Showing top 2 AIC

summary(model.avg(a.max[1:2]))

Full model-averaged coefficients (with shrinkage): 
             Estimate Std. Error Adjusted SE z value Pr(>|z|)    
(Intercept)  39.19741    1.81718     1.88065  20.842  < 2e-16 ***
Axis.1      -57.59157   22.32729    22.90080   2.515  0.01191 *  
Axis.2      127.60890   83.60797    84.76171   1.506  0.13220    
Axis.3       -5.57240    3.87984     3.94483   1.413  0.15778    
bio5          0.10426    0.06385     0.06611   1.577  0.11477    
Axis.1:bio5   2.29329    0.74450     0.76259   3.007  0.00264 ** 
Axis.2:bio5  -4.10371    2.67227     2.70829   1.515  0.12971    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Relative variable importance: 
                     Axis.1 bio5 Axis.1:bio5 Axis.2 Axis.3 Axis.2:bio5
Importance:          1.00   1.00 1.00        0.78   0.78   0.78       
N containing models:    2      2    2           1      1      1    

Showing stepwise variable selection

> summary(stepAIC(full.max,direction="both"))
Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  38.89633    1.77085  21.965  < 2e-16 ***
bio5          0.11420    0.06223   1.835 0.075805 .  
Axis.1      -64.54042   19.60370  -3.292 0.002429 ** 
Axis.2      163.37944   55.72789   2.932 0.006179 ** 
Axis.3       -7.13441    2.85109  -2.502 0.017640 *  
bio5:Axis.1   2.53663    0.63426   3.999 0.000351 ***
bio5:Axis.2  -5.25404    1.76036  -2.985 0.005402 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3216 on 32 degrees of freedom
Multiple R-squared:  0.9428,    Adjusted R-squared:  0.9321 
F-statistic: 87.96 on 6 and 32 DF,  p-value: < 2.2e-16

2016-06-02 update:

full mod construction for all traits

#Ctmax
full.max<-lm(Ctmax~bio5*Axis.1+bio5*Axis.2+bio5*Axis.3+bio5*Axis.4+Rearing.temp,data=merg)

#Ctmin
full.min<-lm(Ctmin~bio6*Axis.1+bio6*Axis.2+bio6*Axis.3+bio6*Axis.4+Rearing.temp,data=merg)

#thermal tolerance breadth
TNB.full<-lm(nb~Axis.1*bio7+Axis.2*bio7+Axis.3*bio7+Axis.4*bio7+Rearing.temp,data=merg)

Probably a poor way to show output, but you can see the consistency with model averaging at different criteria for selecting top model (Top 2/6/10 AIC, < delta 4, 95 conf int):

full table

Ctmax X X.1 X.2 X.3 X.4 X.5 Ctmin X.6 X.7 X.8 X.9 X.10 X.11 TNB X.12 X.13 X.14 X.15 X.16
top 2 AICc NA top 2 AICc NA top 2 AICc
Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|)
(Intercept) 39.1974113 1.81717555 1.88065146 20.842464 0 NA (Intercept) 6.7081182 0.2276948 0.23542884 28.4931876 0 NA (Intercept) 27.0459363 1.9059241 1.96848068 13.7394979 0
Axis.1 -57.5915669 22.32728931 22.90079918 2.514828 0.01190905 NA Axis.2 -2.1928922 2.3776364 2.41435204 0.9082736 0.3637337 NA bio7 0.3458702 0.05107519 0.05275118 6.5566338 0
Axis.2 127.6089015 83.60796871 84.76171296 1.505502 0.13219514 NA bio6 0.4381219 0.0216451 0.02237847 19.5778289 0 NA Axis.1 0.4799895 1.20994269 1.23717296 0.3879728 0.6980362
Axis.3 -5.5723952 3.87984439 3.94482507 1.412584 0.1577782 NA NA
bio5 0.1042642 0.06385464 0.06611108 1.577106 0.11477121 NA NA
Axis.1:bio5 2.2932857 0.74450301 0.76259317 3.00722 0.00263649 NA NA
Axis.2:bio5 -4.1037142 2.67226625 2.70829115 1.515241 0.12971134 NA NA
NA NA
top 6 AICc NA top 6 AIC NA top 6 AIC
Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|)
(Intercept) 39.09770833 1.858677254 1.925749929 20.30258848 0 NA (Intercept) 6.684988297 0.34810913 0.35955592 18.59234664 0 NA (Intercept) 27.10029979 1.93125923 1.9947008 13.58614773 0
Axis.1 -58.89921078 22.04271462 22.67258208 2.59781663 0.0093819 NA Axis.2 -2.395005615 2.383652 2.42548151 0.98743511 0.3234294 NA bio7 0.340556895 0.05066679 0.05234842 6.505580624 0
Axis.2 128.5516463 84.09957359 85.29260627 1.50718394 0.1317635 NA bio6 0.434621522 0.02580208 0.02660974 16.33317269 0 NA Axis.1 0.233929172 0.87808967 0.89639122 0.26096772 0.7941174
Axis.3 -6.557056311 24.86761943 25.84512303 0.25370575 0.7997229 NA Axis.1 0.237902792 0.84028327 0.86112014 0.27627131 0.7823397 NA Rearing.temp 0.006336015 0.02480004 0.02533232 0.250115874 0.8024977
bio5 0.106760957 0.064593101 0.066955014 1.59451773 0.1108201 NA Axis.4 0.112568203 1.35995056 1.40562657 0.080084 0.9361704 NA Axis.2 0.384777978 1.53920801 1.57274344 0.244654003 0.8067243
Axis.1:bio5 2.335012045 0.732742988 0.752658592 3.10235221 0.0019199 NA Rearing.temp -0.000455203 0.01026543 0.01062611 0.04283813 0.9658306 NA Axis.3 -0.405958108 1.89199147 1.93749909 0.209526864 0.834037
Axis.2:bio5 -4.139188942 2.678608124 2.715902917 1.5240563 0.1274946 NA NA Axis.4 -0.020203078 2.08189989 2.15420639 0.009378432 0.9925172
Rearing.temp 0.001023202 0.006706439 0.006926472 0.14772334 0.8825611 NA NA
Axis.4 0.040415014 0.730962899 0.759753398 0.05319491 0.9575766 NA NA
Axis.3:bio5 0.033707897 0.907576414 0.943914621 0.03571075 0.971513 NA NA
NA NA
top10 AICc NA top 10 AIC NA top 10 AIC
Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|)
(Intercept) 3.93E+01 1.899791736 1.963911039 20.01693332 0 NA (Intercept) 6.708401068 0.37935474 0.39165216 17.1284671 0 NA (Intercept) 26.99566918 1.97438106 2.03890627 13.2402698 0
Axis.1 -5.49E+01 22.98798186 23.54781004 2.33039265 0.0197854 NA Axis.2 -2.306237463 2.41991077 2.46193139 0.936759437 0.3488823 NA bio7 0.341349078 0.05167576 0.05337701 6.3950576 0
Axis.2 1.13E+02 87.18011426 88.20794399 1.28066363 0.2003118 NA bio6 0.434528345 0.0260478 0.0268662 16.17379177 0 NA Axis.1 0.333219277 1.03492288 1.0576099 0.3150682 0.7527099
Axis.3 -5.52E+00 22.98845903 23.88232484 0.23133093 0.8170577 NA Axis.1 0.134917912 0.91388635 0.93807943 0.143823548 0.8856398 NA Rearing.temp 0.009351372 0.0297777 0.03043434 0.3072639 0.7586425
bio5 9.98E-02 0.06595076 0.068221677 1.46265473 0.1435619 NA Axis.4 0.084008834 1.17585796 1.21528353 0.069126942 0.9448886 NA Axis.2 0.449552864 1.65384841 1.69015606 0.265983 0.7902523
Axis.1:bio5 2.20E+00 0.766278694 0.783913018 2.80383221 0.0050499 NA Rearing.temp -0.001159531 0.01228646 0.01268573 0.091404373 0.9271713 NA Axis.3 -0.513063577 2.10220292 2.15127673 0.2384926 0.8114991
Axis.2:bio5 -3.64E+00 2.782129711 2.81413218 1.29206427 0.1963349 NA Axis.3 -0.018400092 0.73655574 0.76270452 0.024124797 0.9807531 NA Axis.4 -32.1405597 171.6554339 174.0144258 0.1847005 0.8534639
Rearing.temp 8.61E-04 0.007016252 0.007252224 0.1187358 0.9054847 NA Axis.2:bio6 -0.001287445 0.19407722 0.20101898 0.006404594 0.9948899 NA Axis.4:bio7 0.985550403 5.26653 5.33891153 0.1845976 0.8535446
Axis.4 -5.58E-03 0.843672767 0.874007493 0.00638567 0.994905 NA Axis.1:bio6 -0.02598336 0.13293313 0.13465174 0.192967133 0.8469847 NA
Axis.3:bio5 2.85E-02 0.834371842 0.867772028 0.03282359 0.9738153 NA NA
NA NA
delta 4 NA delta 4 NA delta 4
Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|)
(Intercept) 3.92E+01 1.893976934 1.959535391 20.00841235 0 NA (Intercept) 6.705813971 0.36859922 0.38057547 17.62019506 0 NA (Intercept) 26.99362954 1.98319436 2.04854756 13.1769601 0
Axis.1 -5.72E+01 22.60477016 23.20900583 2.46345624 0.0137605 NA Axis.2 -2.092714799 2.40013254 2.43860238 0.85816155 0.3908033 NA bio7 0.341410566 0.05185096 0.05357386 6.3727077 0
Axis.2 1.24E+02 83.35210021 84.53449839 1.4715241 0.1411494 NA bio6 0.434592864 0.02574957 0.0265647 16.35978856 0 NA Axis.1 0.332214471 1.03734215 1.06050509 0.3132606 0.7540827
Axis.3 -6.10E+00 24.04585344 24.98659536 0.24418582 0.8070869 NA Axis.1 0.122426556 0.87143086 0.89445417 0.136872923 0.8911312 NA Rearing.temp 0.009346593 0.02990764 0.03058437 0.3056003 0.759909
bio5 1.03E-01 0.065747663 0.068062898 1.51566905 0.1296031 NA Axis.4 0.125415028 1.4735481 1.52274366 0.082361222 0.9343595 NA Axis.2 2.838029586 17.73196229 18.10766781 0.1567308 0.875457
Axis.1:bio5 2.28E+00 0.749621892 0.768722868 2.96354009 0.0030412 NA Rearing.temp -0.001052176 0.0117087 0.01208889 0.087036633 0.9306424 NA Axis.3 -0.629285835 2.34142278 2.39958368 0.2622479 0.7931303
Axis.2:bio5 -4.00E+00 2.655253834 2.692134467 1.48727215 0.1369429 NA Axis.3 -0.018674382 0.92571971 0.95829077 0.019487177 0.9844525 NA Axis.4 -27.62992928 159.5462112 161.7281284 0.1708418 0.8643481
Rearing.temp 9.52E-04 0.006474441 0.006686524 0.14238996 0.886772 NA Axis.2:bio6 -0.001168247 0.18487511 0.19148771 0.006100898 0.9951322 NA Axis.4:bio7 0.847237516 4.89499575 4.96194424 0.1707471 0.8644226
Axis.4 3.76E-02 0.705181274 0.732950521 0.05130819 0.9590799 NA Axis.1:bio6 -0.023577694 0.12685366 0.12848792 0.183501249 0.8544047 NA Axis.2:bio7 -0.064576645 0.51493977 0.52615353 0.1227335 0.9023182
Axis.3:bio5 3.14E-02 0.875514464 0.910565657 0.03444602 0.9725215 NA NA
NA NA
95 conf int NA 95 conf int NA 95 conf int
Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|) NA Estimate st SE Adjusted SE z value Pr(>|z|)
(Intercept) 39.34503249 1.92565228 1.99050365 19.76637044 0 NA (Intercept) 6.700287577 0.44563973 0.46035289 14.55467701 0 NA (Intercept) 26.9140026 2.16190181 2.23237161 12.05623762 0
Axis.1 -54.06305882 23.2473386 23.81149971 2.27046005 0.0231797 NA Axis.2 -2.082292553 2.57543166 2.62119065 0.79440713 0.4269585 NA bio7 0.34177799 0.05653227 0.05838979 5.85338647 0
Axis.2 105.8730629 88.45648472 89.42203144 1.18397067 0.2364247 NA bio6 0.430727698 0.03052731 0.03143785 13.70092854 0 NA Axis.1 1.57794905 11.89650289 12.18647211 0.12948366 0.896975
Axis.3 -5.432382412 26.52153131 27.54590024 0.19721201 0.8436616 NA Axis.1 0.22267537 1.25918711 1.2934862 0.17215133 0.8633186 NA Rearing.temp 0.01174595 0.03343028 0.03422627 0.34318537 0.731459
bio5 0.098505527 0.066665352 0.068957648 1.42849314 0.15315 NA Axis.4 0.171362227 2.27629114 2.35191008 0.07286088 0.9419168 NA Axis.2 4.24708937 26.46804905 27.17670287 0.15627684 0.8758148
Axis.1:bio5 2.167761235 0.774452089 0.79221653 2.73632417 0.006213 NA Rearing.temp -0.001971928 0.01542466 0.01593332 0.12376128 0.9015043 NA Axis.3 2.55023555 39.50379104 40.44935619 0.06304762 0.9497286
Axis.2:bio5 -3.410786773 2.820916563 2.850966 1.19636179 0.2315554 NA Axis.3 0.344867152 5.82039656 5.9665753 0.05779985 0.9539081 NA Axis.4 -43.46029486 204.0829434 207.3599239 0.20958869 0.8339887
Rearing.temp 0.001015674 0.008166926 0.008450716 0.12018797 0.9043342 NA Axis.2:bio6 -0.028382406 0.36667178 0.37805671 0.07507447 0.9401555 NA Axis.4:bio7 1.32978885 6.25950379 6.36007272 0.20908391 0.8343827
Axis.4 2.178787485 41.79217286 43.10770462 0.05054288 0.9596898 NA Axis.1:bio6 -0.054579821 0.21631607 0.21965669 0.24847785 0.8037647 NA Axis.2:bio7 -0.10008424 0.7756989 0.79685177 0.12559957 0.9000489
Axis.3:bio5 0.037161754 0.968321351 1.006521472 0.03692097 0.970548 NA Axis.3:bio6 0.016466532 0.41755684 0.42796108 0.0384767 0.9693076 NA Axis.1:bio7 -0.03381928 0.34694592 0.35551648 0.09512717 0.9242138
Axis.4:bio5 -0.06684784 1.259095332 1.298786073 0.05146948 0.9589514 NA Axis.4:bio6 0.070484176 1.61820244 1.66039021 0.04245037 0.9661397 NA Axis.3:bio7 -0.08611658 0.98198428 1.00527754 0.08566448 0.9317331
Use function 'rda' to test significance of fractions of interest ```
Looking at plots
R #global model: a+b+c anova(rda(merg$Ctmax~Axis.1+Axis.2+Axis.3+Axis.4+bio5,data=merg)) #fraction a+b ab<-rda(merg$Ctmax~Axis.1+Axis.2+Axis.3+Axis.4,data=merg) anova(ab) #frac b+c bc<-rda(merg$Ctmax~bio5,data=merg) anova(bc) #fraction a (phylo) a<-rda(merg$Ctmax~Axis.1+Axis.2+Axis.3+Axis.4+Condition(bio5),data=merg) anova(a) #fraction c (eco) c<-rda(merg$Ctmax~Condition(Axis.1+Axis.2+Axis.3+Axis.4)+bio5,data=merg) anova(c) Only showing code for CTmax I also applied variance paritioning for Ctmin and thermal tolerance breadth
|Trait |Independent.Phylogeny |Independent.Ecology |Phylogeny |Ecology | Phylogeny.and.Ecology|Full | Residual| |:-----------------|:---------------------|:-------------------|:---------|:-------|---------------------:|:-----|--------:| |Ctmax |0.14 |0 |0.90 |0.75 | 0.75|0.90 | 0.10| |Ctmin |0 |0.31 |0.64 |0.92 | 0.60|0.91 | 0.09| |Tolerance Breadth |0 |0.45 |0.17 |0.57 | 0.11|0.53 | 0.47| Note-Bolded values represents significant variance component. The combined phylogeny and ecology variance component can not be tested for significance, only indirectly measured. The ecological component is represented by Tmax for Ctmax, Tmin for CTmin, and TAR for tolerance breadth.
### Different way to represent proportion of variance explained by each component
CTmax
CTmin

Notes from climate cascade meeting, 2016-06-01

I have meetings with SHC and NJG every week, so I'll start logging our discussions here

We talked about the analysis from the thermal niche paper:
1. NJG and SHC don't have strong feelings about model averaging.
2. FOr the 4 panel field vs phytotron CTmax and Ctmin figure, keep separate lines for each species 3. For thermal tolerance breadth, make 1 line
4. Include variance partitioning analysis: Estimate amount of variance that go into phylogenetic components, ecological component, and their shared component.
5. For CTmax , perform a Levine's test on the raw residuals from the regression line for picea (field vs phytotron).
6. NJG: What does the literature say? Do people compare field vs common garden often? Do people assay thermal tolerance in the field alone?

Writing this up SHC suggestion for results: Talk about field, then phyto, then present thermal tolerance breadth for phytotron.

For the phytroton gxp paper:
1. Remake boxplot to include Axis 2

Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1
#visualizing boxplot(lt[,1]~lt[,2],ylab="raw residuals",las=1) ```
##Summary: Yes, sig diff in variance between field and phyto.

Page 22: 2016-06-02. Brute force fitting nls() functions in R!!

I googled how to fit nls even when failing to converge in R and found this gem.

Basically, use nls2() to brute force fit curves. I have not tried it, but putting it here as a ref.

Page 23: 2016-06-02. Literature reference for thermal niche paper to help write manuscript

Probably not comprehensive, but here it is:
Thermal breadth = 1 if they analyze it, 0 if they don't.

Table:

Type Author Year Journal Taxa Rearing_acclimation.Temperature Heat_tolerance_Trait Cold_tolerance_trait Thermal_Breadth Locale
Meta-analysis Addo-Bediako et al. 2000 Proceedings of the royal society b Insects NA Global
Lab acclimation Deere & Chown 2006 American Naturalist Mites 1; 5; 10; 15 Locomotion Locomotion 1 Southern Ocean
Field Compton et al. 2007 Experimental marine biology and ecology Bivalve Ctmax Ctmin 1 Europe
Lab acclimation Calosi et al. 2008 Biology letters Beetles 14.5;20.5 Ctmax Ctmin 0 Europe
Lab acclimation Calosi et al. 2008 Journal of biogeography Beetles 14.5; 20.5 Ctmax Ctmin 1 Africa to Europe
Field Sinervo et al. 2010 Science Lizards Tb 0 Mexico
Lab acclimation Calosi et al. 2010 Journal of Animal Ecology Beetles 14.5; 20.5 Ctmax Ctmin 1 USA
Lab acclimation Anert et al. 2011 Integrative and Comparative Biology Plants 20-24 RGR RGR 1 USA
Meta-analysis Sunday et al. 2011 Proceedings of the royal society b Terrestrial and Marine Ctmax Ctmin 1 Global
Common garden Overgaard et al. 2011 American Naturalist Fruit Fly 25;29 Ctmax;KO Ctmin;KO 1 Australia
Common garden Krenek et al. 2012 Plosone Paramecium 22 GR GR 1 Europe
Meta-analysis Grigg & Buckley 2012 Biology letters Lizards Ctmax Ctmin 1 Global
Short acclimation Sheldon & Tewksbury 2014 Ecology Beetles 20 Ctmax Ctmin 1 North and Central America
Common garden Sheth & Angert 2014 Evolution Plants 20-25 RGR RGR 1 North America
Meta-analysis Khaliq et al. 2014 Proceedings of the royal society b Birds and Mammal Ctmax Ctmin 1 Global
Short acclimation Sheldon et al. 2015 Global Ecology and Biogeography Lizards 29 Ctmax Ctmin 1 Argentina
Lab acclimation Bonino et al. 2015 Zoology Lizards 20-40 Ctmax Ctmin 1 Argentina
Velasco et al. 2016 Journal of biogeography NA Central America
Meta-analysis Lancaster 2016 Nature Climate Change Insects Ctmax Ctmin 1 Global
Lab acclimation Gutierrez-Pesquera et al. 2016 Journal of biogeography Frogs (tadpoles) 20 Ctmax Ctmin 1 Global
### I tried this on my desktop to play with the data quick and dirty, but it should go in my dissertation repo: So the main problem I had in the past was that nls would stop if it the fit was poor, nls2() will brute force fit curves.
Here is my mock dataset:
Colony | temp| FC_hsc70_1468| FC_Hsp83_279| FC_Hsp83_1583| FC_hsp40_424| T| |:------|----:|-------------:|------------:|-------------:|------------:|----:| |SHC6 | 25.0| 0.8180765| 1.2727190| 1.3741141| 1.5064240| 25.0| |SHC6 | 28.0| 0.8999074| 1.3778736| 2.3077710| 1.9297926| 28.0| |SHC6 | 30.0| 0.7922560| 0.9294879| 1.1390051| 1.2217515| 30.0| |SHC6 | 31.5| 0.8561583| 1.1546421| 0.8679142| 1.0613295| 31.5| |SHC6 | 33.0| 3.3855425| 1.9787656| 1.8540116| 2.6265211| 33.0| |SHC6 | 35.0| 7.1917199| 2.5450325| 3.5009441| 4.0735230| 35.0| |SHC6 | 36.5| 19.4708137| 3.4314556| 4.3630936| 5.6521932| 36.5| |SHC6 | 38.5| 30.8610304| 4.2174121| 6.7580144| 6.2792960| 38.5| |SHC6 | 40.0| 32.5603639| 4.6504188| 7.5401674| 9.2319657| 40.0| |SHC6 | 41.0| 26.0984907| 2.8898872| NA| 7.1626251| 41.0| |Avon | 25.0| 1.1732547| 1.2784472| 1.1390452| 1.1012987| 25.0| |Avon | 28.0| 1.4387152| 1.5022087| 1.3336969| 1.3226495| 28.0| |Avon | 30.0| 0.8752047| 1.1583008| 1.2902416| 0.7690966| 30.0| |Avon | 31.5| 1.1998622| 1.0781117| 1.2132109| 0.7942291| 31.5| |Avon | 33.0| 2.0881946| 2.1492356| 2.6482339| 1.8422990| 33.0| |Avon | 35.0| 6.7926522| NA| 4.8219890| 3.9911963| 35.0| |Avon | 36.5| 10.7125651| 4.0352515| 5.5000395| 4.0940620| 36.5| |Avon | 38.5| 22.8261858| 7.0736972| 9.0038236| 6.7629965| 38.5| |Avon | 40.0| NA| NA| NA| NA| 40.0| |Avon | 41.0| 32.0884860| 10.1245880| 12.7812078| 11.7503167| 41.0| |KH7 | 25.0| 0.8116304| 0.7712080| 0.9304326| 0.8853779| 25.0| |KH7 | 28.0| 1.0896696| 1.1911849| 1.1219525| 0.9427790| 28.0| |KH7 | 30.0| 1.1757139| 1.2275952| 1.3403029| 0.9426073| 30.0| |KH7 | 31.5| 1.3429711| 2.1066143| 1.7442530| 1.1386698| 31.5| |KH7 | 33.0| 3.7095882| 3.2454970| 2.8123354| 1.9888572| 33.0| |KH7 | 35.0| 7.6945833| 3.1906332| 3.0515822| 2.2617349| 35.0| |KH7 | 36.5| 19.6792961| 7.5792950| 5.6249460| 4.7316773| 36.5| |KH7 | 38.5| 25.7475125| 6.0603869| 5.5386550| 4.9766214| 38.5| |KH7 | 40.0| 47.1850131| 12.1240032| 9.7991379| 8.5075648| 40.0| |KH7 | 41.0| 44.5367758| 11.4567417| 9.6551695| 9.7848318| 41.0| |test | 25.0| 1.0000000| 10.0000000| 5.0000000| 1.0000000| 25.0| |test | 28.0| 1.0000000| 9.0000000| 5.0000000| 2.0000000| 28.0| |test | 30.0| 2.0000000| 8.0000000| 5.0000000| 4.0000000| 30.0| |test | 31.5| 3.0000000| 7.0000000| 5.0000000| 6.0000000| 31.5| |test | 33.0| 4.0000000| 6.0000000| 5.0000000| 8.0000000| 33.0| |test | 35.0| 5.0000000| 5.0000000| 5.0000000| 8.0000000| 35.0| |test | 36.5| 6.0000000| 4.0000000| 5.0000000| 8.0000000| 36.5| |test | 38.5| 7.0000000| 3.0000000| 5.0000000| 8.0000000| 38.5| |test | 40.0| 8.0000000| 2.0000000| 5.0000000| 8.0000000| 40.0| |test | 41.0| 9.0000000| 1.0000000| 5.0000000| 8.0000000| 41.0|
1. Now I have to fit curves (boltzmann function) to for each colony and each gene (FC_70, FC_83, and FC_40). You can see I have a test colony with made up numbers, these should be poor fits.
I'm using nls2() and this curve estimates the critical temperature (Tm), slope (a), and max expression
```R Boltz<-function(data=x){ B<-nls2(gxp ~ (1+(max-1)/(1+exp((Tm-T)/a))),data=data, start=list(max=80,Tm=35,a=1.05), trace=TRUE,control=nls.control(warnOnly = TRUE, tol = 1e-05, maxiter=1000)) #summary(B) return(summary(B)$parameters) }
```
2. I'll need to convert it long format, it is in wide right now.
R names(m) [1] "Colony" "temp" "FC_hsc70_1468" "FC_Hsp83_279" [5] "FC_Hsp83_1583" "FC_hsp40_424" "T" > mlong<-gather(m,gene,gxp,FC_hsc70_1468:FC_hsp40_424)
3. fit for each colony and gene with ddply + Boltz functions
```R fits<-ddply(mlong,.(Colony,gene),Boltz) fits<-cbind(fits,rep(c("max","Tm","slope"),length(fits$Colony))) # adding parameter column names(fits)[7]<-"parameter"# renaming column
knitr::kable(fits) ```
Trying fits by removing test colony
|Colony |gene | Estimate| Std. Error| t value| Pr(>|t|)|parameter | |:------|:-------------|----------:|----------:|----------:|------------------:|:---------| |Avon |FC_hsc70_1468 | 35.8189402| 1.3830780| 25.897990| 0.0000002|max | |Avon |FC_hsc70_1468 | 37.7704625| 0.1824726| 206.992555| 0.0000000|Tm | |Avon |FC_hsc70_1468 | 1.5075619| 0.1117296| 13.492950| 0.0000103|slope | |Avon |FC_Hsp83_279 | 13.0621490| 1.7746986| 7.360207| 0.0007271|max | |Avon |FC_Hsp83_279 | 38.5802879| 0.7637267| 50.515830| 0.0000001|Tm | |Avon |FC_Hsp83_279 | 2.1031077| 0.3554831| 5.916195| 0.0019659|slope | |Avon |FC_Hsp83_1583 | 16.8751069| 2.4307114| 6.942456| 0.0004429|max | |Avon |FC_Hsp83_1583 | 38.4508017| 0.9001894| 42.714125| 0.0000000|Tm | |Avon |FC_Hsp83_1583 | 2.4352914| 0.3611821| 6.742558| 0.0005186|slope | |Avon |FC_hsp40_424 | 21.9643380| 12.1034762| 1.814713| 0.1194923|max | |Avon |FC_hsp40_424 | 40.8933831| 2.9441107| 13.889893| 0.0000087|Tm | |Avon |FC_hsp40_424 | 2.6054162| 0.6918408| 3.765919| 0.0093334|slope | |KH7 |FC_hsc70_1468 | 57.0478157| 12.0292674| 4.742418| 0.0021020|max | |KH7 |FC_hsc70_1468 | 38.2671391| 1.0235944| 37.385060| 0.0000000|Tm | |KH7 |FC_hsc70_1468 | 1.7874874| 0.5009719| 3.568039| 0.0091208|slope | |KH7 |FC_Hsp83_279 | 18.8164697| 14.2023236| 1.324887| 0.2268193|max | |KH7 |FC_Hsp83_279 | 39.5972751| 5.0039209| 7.913250| 0.0000977|Tm | |KH7 |FC_Hsp83_279 | 2.9760831| 1.4783205| 2.013152| 0.0839745|slope | |KH7 |FC_Hsp83_1583 | 16.7337144| 10.3102857| 1.623012| 0.1486163|max | |KH7 |FC_Hsp83_1583 | 40.1004665| 4.0390733| 9.928135| 0.0000224|Tm | |KH7 |FC_Hsp83_1583 | 3.0388325| 1.0845309| 2.801979| 0.0264489|slope | |KH7 |FC_hsp40_424 | 19.9496194| 14.9270787| 1.336472| 0.2231999|max | |KH7 |FC_hsp40_424 | 41.3533804| 3.8900811| 10.630467| 0.0000143|Tm | |KH7 |FC_hsp40_424 | 2.6777066| 0.8223794| 3.256048| 0.0139403|slope | |SHC6 |FC_hsc70_1468 | 30.1357724| 1.3518947| 22.291509| 0.0000001|max | |SHC6 |FC_hsc70_1468 | 36.0181917| 0.2145002| 167.916817| 0.0000000|Tm | |SHC6 |FC_hsc70_1468 | 0.7601739| 0.1966529| 3.865562| 0.0061670|slope | |SHC6 |FC_Hsp83_279 | 3.9378751| 0.3837209| 10.262341| 0.0000180|max | |SHC6 |FC_Hsp83_279 | 34.4580183| 0.8580317| 40.159376| 0.0000000|Tm | |SHC6 |FC_Hsp83_279 | 1.2755059| 0.6850160| 1.862009| 0.1049016|slope | |SHC6 |FC_Hsp83_1583 | 8.6530046| 1.6923497| 5.113012| 0.0021932|max | |SHC6 |FC_Hsp83_1583 | 36.6782852| 1.1214736| 32.705437| 0.0000001|Tm | |SHC6 |FC_Hsp83_1583 | 1.8095631| 0.6243422| 2.898352| 0.0273933|slope | |SHC6 |FC_hsp40_424 | 8.3707957| 1.0694746| 7.827017| 0.0001048|max | |SHC6 |FC_hsp40_424 | 35.6669753| 0.9166608| 38.909679| 0.0000000|Tm | |SHC6 |FC_hsp40_424 | 1.8169999| 0.6708063| 2.708680| 0.0302567|slope |
###looks like it works when there is no poor fit.
```R m<-read.csv("20160607_gxp_test.csv") mT <  − mtemp str(m)
#change to long format mlong<-gather(m,gene,gxp,FC_hsc70_1468:FC_hsp40_424) str(mlong) #mlong<-subset(mlong,mlong$Colony!="test") fits<-ddply(mlong,.(Colony,gene),failwith(f=Boltz)) ## the magical code here ```
##Table of outputs
|Colony |gene | Estimate| Std. Error| t value| Pr(>|t|)| |:------|:-------------|----------:|----------:|----------:|------------------:| |Avon |FC_hsc70_1468 | 35.8189402| 1.3830779| 25.897991| 0.0000002| |Avon |FC_hsc70_1468 | 37.7704625| 0.1824726| 206.992559| 0.0000000| |Avon |FC_hsc70_1468 | 1.5075619| 0.1117296| 13.492950| 0.0000103| |Avon |FC_Hsp83_279 | 13.0621489| 1.7746986| 7.360207| 0.0007271| |Avon |FC_Hsp83_279 | 38.5802879| 0.7637267| 50.515830| 0.0000001| |Avon |FC_Hsp83_279 | 2.1031077| 0.3554832| 5.916195| 0.0019659| |Avon |FC_Hsp83_1583 | 16.8751071| 2.4307113| 6.942456| 0.0004429| |Avon |FC_Hsp83_1583 | 38.4508017| 0.9001893| 42.714127| 0.0000000| |Avon |FC_Hsp83_1583 | 2.4352914| 0.3611821| 6.742558| 0.0005186| |Avon |FC_hsp40_424 | 21.9649309| 12.1044659| 1.814614| 0.1195088| |Avon |FC_hsp40_424 | 40.8935313| 2.9442708| 13.889188| 0.0000087| |Avon |FC_hsp40_424 | 2.6054554| 0.6918546| 3.765900| 0.0093336| |KH7 |FC_hsc70_1468 | 57.0473854| 12.0288922| 4.742530| 0.0021017| |KH7 |FC_hsc70_1468 | 38.2671031| 1.0235676| 37.386005| 0.0000000| |KH7 |FC_hsc70_1468 | 1.7874685| 0.5009659| 3.568045| 0.0091207| |KH7 |FC_Hsp83_279 | 18.8160754| 14.2013489| 1.324950| 0.2267995| |KH7 |FC_Hsp83_279 | 39.5971341| 5.0036704| 7.913618| 0.0000977| |KH7 |FC_Hsp83_279 | 2.9760359| 1.4782900| 2.013161| 0.0839733| |KH7 |FC_Hsp83_1583 | 16.7333374| 10.3095588| 1.623090| 0.1485996| |KH7 |FC_Hsp83_1583 | 40.1003166| 4.0388773| 9.928580| 0.0000224| |KH7 |FC_Hsp83_1583 | 3.0387896| 1.0845105| 2.801992| 0.0264484| |KH7 |FC_hsp40_424 | 19.9504446| 14.9288152| 1.336372| 0.2232310| |KH7 |FC_hsp40_424 | 41.3536013| 3.8903675| 10.629742| 0.0000143| |KH7 |FC_hsp40_424 | 2.6777587| 0.8223999| 3.256030| 0.0139406| |Phil |FC_hsc70_1468 | 14.4816051| 0.6238735| 23.212404| 0.0000028| |Phil |FC_hsc70_1468 | 34.8148669| 0.2209902| 157.540295| 0.0000000| |Phil |FC_hsc70_1468 | 0.8480438| 0.2387966| 3.551322| 0.0163645| |Phil |FC_Hsp83_279 | 4.6238796| 0.4489827| 10.298570| 0.0001484| |Phil |FC_Hsp83_279 | 33.7411733| 0.7422000| 45.461025| 0.0000001| |Phil |FC_Hsp83_279 | 1.2133128| 0.5981040| 2.028598| 0.0982866| |Phil |FC_hsp40_424 | 4.3629872| 0.2614315| 16.688838| 0.0000141| |Phil |FC_hsp40_424 | 34.6387089| 0.3401929| 101.820776| 0.0000000| |Phil |FC_hsp40_424 | 0.7043699| 0.3427897| 2.054816| 0.0950582| |SHC6 |FC_hsc70_1468 | 30.1357991| 1.3519005| 22.291433| 0.0000001| |SHC6 |FC_hsc70_1468 | 36.0181969| 0.2145014| 167.915909| 0.0000000| |SHC6 |FC_hsc70_1468 | 0.7601800| 0.1966547| 3.865558| 0.0061670| |SHC6 |FC_Hsp83_279 | 3.9379010| 0.3837369| 10.261982| 0.0000180| |SHC6 |FC_Hsp83_279 | 34.4580679| 0.8580653| 40.157863| 0.0000000| |SHC6 |FC_Hsp83_279 | 1.2755764| 0.6850461| 1.862030| 0.1048984| |SHC6 |FC_Hsp83_1583 | 8.6530046| 1.6923498| 5.113012| 0.0021932| |SHC6 |FC_Hsp83_1583 | 36.6782851| 1.1214737| 32.705435| 0.0000001| |SHC6 |FC_Hsp83_1583 | 1.8095631| 0.6243422| 2.898351| 0.0273933| |SHC6 |FC_hsp40_424 | 8.3707958| 1.0694747| 7.827016| 0.0001048| |SHC6 |FC_hsp40_424 | 35.6669753| 0.9166608| 38.909677| 0.0000000| |SHC6 |FC_hsp40_424 | 1.8169999| 0.6708063| 2.708680| 0.0302567| |test |FC_hsc70_1468 | 9.8719349| 0.9800918| 10.072460| 0.0000204| |test |FC_hsc70_1468 | 35.6649510| 0.8966939| 39.773830| 0.0000000| |test |FC_hsc70_1468 | 2.9884380| 0.4909301| 6.087299| 0.0004973| |test |FC_hsp40_424 | 8.0828867| 0.1090835| 74.098170| 0.0000000| |test |FC_hsp40_424 | 30.3192228| 0.1219349| 248.650901| 0.0000000| |test |FC_hsp40_424 | 1.1145318| 0.1136478| 9.806893| 0.0000243|
###Notice: That not all genes have fitted parameters! nice! ie. test hsp83's!
###Now we need to:
1. Predict new sets of values for each gene/colony 2. Visualize actual vs predicted values!
###Code to predict new values
* first, the plotting function ```R fud<-function(T=seq(25,70,.1),Tm=40,slope=1.8,max=50){ y<-1+ (max-1)/(1+exp(((Tm-T)/slope))) return(y) }
plot(fud()) ```
* OK, now the data manipulation
```R #grab fitted lines from estimates #change to wide format library(reshape2) feeder<-dcast(fits2,Colony+gene~parameter,value.var="Estimate")
list_predictions<-sapply(split(feeder,list(feederColony, feedergene)),function(x) {fud(T=seq(25,45,.1),Tm=xTm, slope = xslope,max=x$max)})
predi<-as.data.frame(do.call("rbind", list_predictions),stringAsFactors=FALSE) predi$Sample<-row.names(predi)
nom<-as.data.frame(matrix(unlist(strsplit(predi$Sample,"[.]")),ncol=2,byrow=TRUE)) #messing with the names names(nom)<-c("Colony","gene") predictions<-cbind(predi,nom) ##gotta change to long format conv<-gather(predictions,Colony,gxp,V1:V201)[,-4] #need to sort conv<-conv[order(conv$Sample),] #dont forget to order!!!
plong<-cbind(conv,rep(seq(25,45,.1),nrow(predi))) names(plong)[5]<-"T" head(plong)
```
##Plotting with ggplot
* for hsc70-4 h2 * lines = predicted fit from function * points = empirical
* hsp83 279
* hsp40 541 *

Page 31: 2016-06-08. Redoing online notebook template

I updated my online notebook template.. I probably should have done this from the start. But there is a table of contents with 200 entries with automatic links to those entries.

Code for automatically generating table of contents:

* [Page 1: Date](#id-section1). Title
* [Page 2: Date](#id-section2). Title

For table of contents, you want this syntax:

  1. I used R with a series of paste functions to get the right syntax
  2. Exported to csv and just pasted it into the markdown
#constructing table of contents
one<-rep("* [Page",200)
two<-seq(1:200)

three<-paste(one,two)
four<-paste(three,":","]",sep="")

five<-paste(four,"(#id-section",two,").",sep="")
six<-data.frame(five)
write.csv(six,"ffff.csv")

Code for automatically generating entries with titles that correspond with table of contents

For this you want this syntax:

------ 
<div id='id-section1'/>
  1. R manipulations
b<-rep("------",200)
c<-rep("<div id='id-section",200)
d<-seq(1:200)
e<-paste(d,"'/>",sep="")


m<-paste(c,e,sep="")
m

i<-rep("### Page",200)
i2<-paste(i,rep(1:200))
i3<-paste(i2,":",sep="") # can even add year here

m1<-paste(b,m,i3,sep="
          ")
write.csv(m1,"testy.csv")
  1. Export to csv
  2. You do need to get rid of header and first column manually, save and close (in excel)
  3. Open in textwrangler and you'll see that the line breaks appear. Then get rid of quotes.
### machine repaired
>Dear Andrew, The repair of your instrument on service reference notification 405638599 has been completed and is now on its way back to you. For your record the reference tracking number is 650686939762 I will be sending you a separate email with the decontamination forms and FedEx labels to return the loaner you received during the repair of yor instrument. Please send this loaner back in a timely fashion as we do have other customers in need of this loaner. Thank you, Leticia C. Instrument Services Life Sciences Solutions
### Sending back loaner
>Dear Andrew, Attached you will find the necessary paperwork to ensure that the loaner unit is returned correctly and promptly. 1. Your RMA is 14635-69 2. Please review and complete the attached decontamination form and print out 2 copies. 3. Please remember to place the instrument in the "Ship Prep" position prior to packing the instrument. 4. Please DO NOT include your power cord with your instrument (remove from unit and keep it). 5. Please DO NOT include any consumables (trays, tubes, etc.). 6. Place a copy of the completed decontamination form INSIDE and OUTSIDE of the box. 7. Print out the FedEx label, (link will arrive via separate email). The return transaction cannot be processed until the completed decontamination form and the instrument are received. Thank you, Leticia C. Instrument Services Life Sciences Solutions
We received the repaired machine back.
Here is the decomtamination form for the loaner.

Page 36: 2016-06-10. Thoughts on Kingsolver & Woods 2016, AmNat. ref here

reference: * Kingsolver JG, Woods HA. 2016. Beyond Thermal Performance Curves: Modeling Time-Dependent Effects of Thermal Stress on Ectotherm Growth Rates. The American Naturalist 187:283–294.

This paper models growth rate under heat stress over time. The authors use Hsp gene and protein expression as a measure of cost and ingestion rate as a trait that inputs energy into an animal.

Fig 1:

The physiology is more complicated than this. First, increasing Hsp gene expression is costly in itself, so there should be a separate cost term. While the actual Hsp protein expression is costly to invest into too, there is a cost for using them and also having unstable proteomes. Also, organisms can get rid of unstable proteins through degradation and haulting translation which would offsets the costs of Hsp (gene or protein) expression and using it. Basically, I'm saying the actual cost incurred come in the form of macromolecular damage (proteome stability) and the response to macromolecular damage (Hsp expression). Not sure if proteome stability cost needs to included

But here is a fig for proteome stability (prop non-denatured) as a function of Temperature:

The black line is 10 min incubations, the red line is 20 min. I fit a non-linear logistic curve to it link. This captures the incurring costs associated with temperature AND time without an acclimation response. It'd be interesting to develop a model from this....

2016-06-11. Follow up model

I've included potentially important physiological components. Macromolecular damage includes unstable proteins and damage to membranes. For simplicity, it can just represent unstable proteins. On second thought, it should be macromolecular stability, assuming there is an optimal stability of membranes and proteins for growth. So temperature directly affects macromolecular stability and given a certain amount of damage(instability), it elicits a physiological response ( transcription + translation) . Transcription includes all the transcripts that turn on and turn off. If the net effect is using more energy to turn on/off over higher temperatures, this incurs a cost. Same with translation, but there is also a cost of "using" the proteins. For example, Hsp mediated folding uses ATP. However, the combination of altering translation rates and using the proteins offsets the costs of macromolecular damage which directly affects growth.

Anyway, I'd call this the "thermostat" model.

  • Craig EA, Gross CA. 1991. Is hsp70 the cellular thermometer? Trends in Biochemical Sciences 16:135–140.

2016-06-13. Predictions of thermostat model

  1. There is some temperature where the costs associated with macromolecular damage exceeds any type of physiological response (transcription, translation), resulting in inhibited growth.
  2. Under sublethal temperature stress, the negative long term outcome of inducing a physiological response may be tempered by increasing ingestion rate.
  3. Although a physiological response is costly, at sublethal levels, the combination of gene/protein expression(downregulating unstable proteins) and upreg of Hsps may have a net positive effect on proteome stability, which is related to growth.

Note: There is a cool paper by Hoekstra & Montooth that shows how Hsp70 expression covaries with metabolic rate.

  • Hoekstra LA, Montooth KL. 2013. Inducing extra copies of the Hsp70 gene in Drosophila melanogaster increases energetic demand. BMC Evolutionary Biology 13:68.

Other thoughts:
1. One cool thing about the model is that you can add transcriptome, proteome data as parameters into the model. How?
* Count the costs of each transcript (# of basepairs) and subtract response from baseline to get relative response. One could argue that overall, Hsp expression is not costly because other transcripts can be downregulated at the same time. I don't think anybody has tried to explore this in transcriptome datasets.

  1. In aquatic systems, oxygen limitation seems to be the mechanism for upper thermal limits. Is there a way to make one global model so that we can make predictions for any ectotherm?
Alternate title: Quantifying the intensity of selection associated with climate change.
Question: Are populations experiencing selection associated with climate change out in nature?
Hypothesis: The magnitude and direction of selection acts on different parts of their range depending on their thermal environment. Predictions: 1. Individuals at the warm edge of their range experience positive directional selection for a thermal trait. 2. Individuals at the core experience stabilizing selection for a thermal trait. 3. Individuals at the cool edge experience negative directional selection for a thermal trait.
Approach: Measure phenotypic selection on physiological, behavioral traits across a cline for a given species. A good system to measure phenotypic selection are ants because alates are direct measurement of fitness. So the product of # of alates by their weights will give a meaurement of fitness. Then, regress different traits on relative fitness to obtain a selection gradient. I can detect disruptive and stabilizing selection by adding a quadratic term in the regression model. I don't want to automatically assign individuals to warm-edge, cool edge, core. I'd sample along a cline (10-20 sites?). Also, there may be differences in the phenology for alates to develop, so I'd probably need to sample 3-4 times a year?
Some key traits: 1. Colony size ( # of workers, # of larvae, # of pupae, Colony biomass really) 2. Thermal tolerance ( CTmax, Ctmin, KO-time, hardening ability) 3. Morphology ( leg length, average worker weight)
Some things to think about:
1. I read somewhere (find it) that what one really wants is the life time reproductive success (LRS). But this is almost impossible to measure. In this sense, it is more accurate to say I'm measuring episodic selection (Angiletta 2009)? 2. Also, one should be comparing within a generation. There may be different age classes of colonies, but it may be reasonable to assume that if the colony has alates, then they belong to a similar age class. 3. I'd need to do some pop gen to determine the population level structure so that I can empirically assign individuals to populations.
Another thought: Phenotypic selection seems like a good way to associate higher and lower phenotypic levels. 1. For example, I have CTmax data and the underlying stress response measured. CTmax is a component of fitness, so if I regress the stress response onto the relative fitness of CTmax(CTmax of individual/ population CTmax mean) , then I can determine a selection gradient. 2. I can also measure phenotypic selection for allele frequencies! (Dr. Goodnight's suggestion)

Page 40: 2016-06-14. qPCR's: Diluting samples for quantifying basal expression and repeats

Diluting samples for basal expression:

I diluted 1x cDNA samples 1:10, so I added 5 uL cDNA with 45 uL water. I added 25C-mid samples (because of technical mistake in diluting) for some colonies to replace 25C samples that were started at the beginning of heat shock.

  1. F10: Duke8 41 (switched with AS4)
  2. F11: SHC10 mid
  3. F12: AS4 25C
  4. G1: yates3 mid
  5. G2: shc2 mid
  6. G3: exit65 mid
  7. G4: greenfield mid

I also diluted the 1:10 cDNA samples again at 1:10 to run 18s rRNA. So I added 2 uL cDNA into 18 uL water.

All in all, it took ~ 3 hours from organization to completion.

Updated plate layout:

Row Column Colony
A 1 Ala1
A 2 KITE8
A 3 Yates2
A 4 FBRAGG3
A 5 CJ4
A 6 BK
A 7 HW7
A 8 KH3
A 9 DUKE9
A 10 SHC8
A 11 CJ2
A 12 HF2
B 1 shc7
B 2 MA
B 3 PB07-23
B 4 CJ8
B 5 Lex9
B 6 ApGxL10A
B 7 Phillips
B 8 hf3
B 9 PB17-10
B 10 CJ6
B 11 Ala4
B 12 CJ5
C 1 PB17-14
C 2 DUKE8
C 3 KH1
C 4 Greenfield
C 5 fbragg1
C 6 Avon19.1
C 7 CampNSP
C 8 KH6
C 9 KH5
C 10 DUKE2
C 11 SHC9
C 12 LPR2
D 1 KITE4
D 2 FBRAGG4
D 3 KH7
D 4 DUKE1
D 5 PMBE
D 6 DUKE6
D 7 CJ7
D 8 fbragg5
D 9 CJ1
D 10 LPR4
D 11 YATES3
D 12 POP1
E 1 kh2
E 2 Bingham
E 3 SHC3
E 4 ApGxL09A
E 5 Ted6
E 6 DUKE7
E 7 SHC6
E 8 DUKE4
E 9 DUKE5
E 10 Ted4
E 11 EXIT65
E 12 sidewalk (formica)
F 1 POP2
F 2 fbragg2
F 3 SHC2
F 4 LEX13
F 5 SHC5
F 6 cremat
F 7 SHC10
F 8 pop3
F 9 SR45
F 10 Duke 8 41
F 11 SHC10 mid
F 12 AS4
G 1 yates3 mid
G 2 shc2 mid
G 3 exit65 mid
G 4 gf mid

Repeats ran alongside CJ8

Ran hsp83 279 55 C annealing for following coloines:

  1. Fbragg1
  2. CJ1
  3. CJ8
  4. KH1; 1 row
  5. FB4; 1 row

results: Fb4 not work

Ran hsp40 541 prim 55C annealing for the same colonies as above.
results: CJ8 and KH1 worked

Ran 18s rRNA for following colonies:
1. CJ1 2. CJ8 3. KH1

results: all worked

Update of samples:

Status X18s hsc70.4_1468_1592_degen hsp83_279_392_degen hsp83_1583_1682_degen hsp40_424_525degen
works 67 58 65 45 57
double peaks 0 9 2 20 10
total 67 67 67 65 67
Hsp70, 40, 83 from top to bottom

Page 44: 2016-07-18. Summary statistics for modulation of Hsp paper.

Overall means

| | mean xp| |:-----|------------------------------:| |FC_83 | 11.218868| |FC_70 | 50.227915| |FC_40 | 10.535062| |B_83 | 1.735492| |B_70 | 1.446917| |B_40 | 1.935067| ### Comparison among genes

medians

Rearing_Temp Induction83 Basal83 Induction70 Basal70 Induction40 Basal40
20 7.046216 0.9032384 48.88187 0.4773797 6.903618 1.155806
26 10.441149 1.5197949 39.13139 2.2318297 13.267033 1.559372

means

Rearing_Temp Induction83 Basal83 Induction70 Basal70 Induction40 Basal40
20 9.352522 1.262254 55.45230 0.640272 8.059647 1.680941
26 14.320365 2.334319 42.62233 2.565149 14.086744 2.299683
Mixed effects stat models let you include random or fixed variables, implemented in (lme4 package](http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf). The difference? Summarized here in dynamic ecology blog.
As I understand it: (Using sites as an example...)
Fixed effect... * variable you're interested in * continuous or categorical * estimates values at each site, so if you have a lot of sites, it'll use more degrees of freedom * syntax: (y~x+s)
Random effect... * variable you want to control (blocking) * categorical/discrete (Can not have continuous variable as a random effect)) * estimates variance among all sites, conserves degrees of freedom (also cant calculate p values) * syntax: (yx,random=1|s ) * rule of thumb: sites should have roughly >5 levels ( 5 sites) * comment in blog post says you can think of RE as groups having different slopes and or intercepts
Typing this out seems to make more sense. Now to go over some of the syntax.... * see this * and this
This tutorial gives a good explanation.
It's hard to get p-values from mixed effects models, so one strategy is to make a full and null model with and without the variable of interest and running an anova. Don't use REML when doing these comparisons.
More syntax...
This specifies a random slope model:
R politeness.model = lmer(frequency ~ attitude + gender + (1+attitude|subject) + (1+attitude|scenario), data=politeness,REML=FALSE) This allows subjects and items to have difference slops and intercepts. Only thing changed is the random effect
Best practice to fit random slopes and intercepts! (Grueber et al. 2011, Journal of Evolutionary Biology; and the tutorial advocates for this because it reduces type I and II errors)
Notes, assumptions similar to fixed effects models
1. Check for collinearity and influential data points 2. check residuals, Q-Qplots 3. One of the main shifts from linear models to mixed effect models was to account for non-independence (measuring outcome of same individual)
### random effects note
### Writing this up in a methods section

Page 48: 2016-07-27. Meeting with Steve Keller to discuss post doc idea (started here: Page 37: 2016-06-11. Quantifying natural selection in natural populations )

Raw notes from notebook: Page 1
Page 2

Thoughts+ retyping notes:

  1. One challenge Steve brought up was that photoperiod is diff across lat and is not changing with climate. So when scientists do recipricol transplants between north and south populations, photoperiod is a confounding effect with temperature/climate.
  2. Selection gradients may not be increasing with climate if there is insufficient genetic variation to respond to selection. It could decrease. I need to think more carefully about how to connect selection gradients with population level dynamics. (I still need to read Ruth Shaw's aster modeling papers).
  3. RIght now, as I've pitched it, I have no manipulations which is something I need to determine whether temperature is actually increasing selection gradients.
  • start cohorts at different times to control development
  • for biotic interactions, manipulate floral display for pollintors
  • induce herbivory- plant stress responses
  1. I could estimate kinship matrix in natural populations with many markers(thousands) and apply quantitative genetics techniques to identify constraints between different traits.

Post doc grants:

  1. Plant genome fellowship due in november (focused on crop or crop related plants)
  • systems: sunflower, grasses, medacago, poplar(viability selection, high early life stage mortality), willows
  1. Fullbright for international opportunities
### analysis of data with pre treatment temperature as continuous within an anova ```R ## anova model k.datpretreatTemp <  − as.numeric(as.character(k.datpretreat_Temp)) cold.mod1<-aov(treatment_recovery_s~Tmin*pretreat_Temp+Colony,data=k.dat) # testing interaction between pre-treat temp and T min (both continuous)
Df Sum Sq Mean Sq F value Pr(>F) Tmin 1 116145 116145 5.755 0.018765 pretreat_Temp 1 261310 261310 12.949 0.000553 Tmin:pretreat_Temp 1 162568 162568 8.056 0.005747 Residuals 80 1614444 20181 ```
### analysis of data with pre treatment temperature as a factor within a linear model
``` ##analysis of data with pre treatment temperature as a factor within a linear model k.datpretreatTemp <  − as.factor(as.character(k.datpretreat_Temp)) cold.mod1<-lm(treatment_recovery_s~Tmin*pretreat_Temp+Colony,data=k.dat) #testing interaction between factors of pretreatment with Tmin(continuous) #summary(cold.mod1) #stepwise aic qc<-stepAIC(cold.mod1,direction="both") summary(qc)
#output: summary(qc)
Call: lm(formula = treatment_recovery_s ~ Tmin + pretreat_Temp + Tmin:pretreat_Temp, data = k.dat)
Residuals: Min 1Q Median 3Q Max -292.69 -79.96 -10.13 69.04 355.98
Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 210.58 363.71 0.579 0.56432 Tmin -24.64 24.27 -1.015 0.31321 pretreat_Temp0 450.14 514.37 0.875 0.38426 pretreat_Temp25 1796.59 514.37 3.493 0.00080 pretreat_Temp5 1173.92 514.37 2.282 0.02527 Tmin:pretreat_Temp0 40.73 34.33 1.186 0.23916 Tmin:pretreat_Temp25 114.57 34.33 3.338 0.00131 Tmin:pretreat_Temp5 76.71 34.33 2.235 0.02837 *

Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 124 on 76 degrees of freedom Multiple R-squared: 0.4577, Adjusted R-squared: 0.4078 F-statistic: 9.164 on 7 and 76 DF, p-value: 3.644e-08

```

More digestable table:

knitr::kable(summary(qc)$coefficients)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 210.58099 363.71495 0.5789726 0.5643197
Tmin -24.64324 24.27295 -1.0152553 0.3132054
pretreat_Temp0 450.14412 514.37061 0.8751358 0.3842574
pretreat_Temp25 1796.59479 514.37061 3.4928022 0.0008002
pretreat_Temp5 1173.91549 514.37061 2.2822367 0.0252738
Tmin:pretreat_Temp0 40.72533 34.32714 1.1863889 0.2391643
Tmin:pretreat_Temp25 114.57348 34.32714 3.3376940 0.0013101
Tmin:pretreat_Temp5 76.71280 34.32714 2.2347566 0.0283715

Hardening ability

cold.mod8<-aov(hardening~Tmin*PT+Colony,data=mew6)
qc8<-stepAIC(cold.mod8,direction="both")
summary(qc8)

  Df  Sum Sq Mean Sq F value   Pr(>F)    
Tmin         1   85850   85850   5.903  0.02055 *  
PT           2  550143  275071  18.915 3.01e-06 ***
Colony      17 1435781   84458   5.808 6.88e-06 ***
Tmin:PT      2  179795   89897   6.182  0.00513 ** 
Residuals   34  494455   14543  

Good post to read for understanding interactions here

2016-08-11 updated analyses

Basal cold tolerance re-analyzed

    df  SS  MS  F-value P-value
Tmin    1   114575  114575  6.757   0.0122
Pre-treatment   3   623523  207841  12.257  <0.001 
Tmin × Pre-treatment    3   189451  63150   3.724   0.0169
Colony  17  228419  13436   0.792   0.6931
Residuals   51  864771  16956       

Cold hardening re-analyzed (double checked)

    df  SS  MS  F-value P-value
Tmin    1   411796  411796  26.318  <0.001 
Pre-treatment   2   363498  181749  11.616  <0.001 
Tmin × Pre-treatment    2   98308   49154   3.141   0.055986
Colony  17  1285635 75626   4.833   <0.001 
Residuals   34  531992  15647       

Interaction non-significant; the change was caused by a mistake made by consolidating scripts.

1. Project updates: * Gene expression project: on hold; focusing on 2 manuscripts (multiple stressors and range limits ms) * Multiple stressors ms: SHC's hands- discussion is too disjointed, reworking organization * Range limits ms: Fixed figures, go over! * Thermal niche ms: Lacy and I are working on it. Discussion left to do * HSP modulation paper: SHC's hands * Stressed in nature MS: Curtis' hands ; he was suppose to give me a timeline * Genome sequencing? Mlau's hands * Phylogenomics of common forest ants: ADN to send Bernice samples this week. 2. Attending SICB - Jan 4-8 New Orleans, Give a talk about range limits paper. * Apply for funding. Suitor Travel Grant Deadline is october 31 3. Biolunch, working title: Strategies for achieving reproducible research ; get picture of the meeting

Page 55: 2016-08-11. Overlaying raster files in a map in R

Good link to show how to overlay here. I've had to use this to plot climate cut offs (example: here)

Some code:

Cropping world map, I set coords to region I'm interested in: Maine

w2 <- getData('worldclim', var='bio', res=.5,lat=45,lon=-68) # grab worldclim data; with .5 res you need to specify coordinates

extent<-c(-72,-65,42,48)
bew<-crop(w2,extent)

Here is the code to make cut offs: designate extreme values and then plotting it will be easy

You have to get rid of NAs and assign to variable.

Tm<-na.omit(bew[[5]])
Tm[bew[[5]] < 246.5] <- 100 # absent
Tm[bew[[5]] > 246.5] <- 1

Here is plotting the cut off

dbio2$coco<-ifelse(dbio2$Found_Notfound=="1","red","black") # specify color of points base don presence absence

plot(lar[[5]],col=c("white","grey75"),legend=F)
map("worldHires",c("USA","Canada"),add=TRUE)
map("state", c('maine','vermont','new hampshire'), add = TRUE)
points(dbio2$Lon,dbio2$Lat,pch=16,col=dbio2$coco)
Signif. codes: 0 ‘’ 0.001 ‘’ 0.01 ‘’ 0.05 ‘.’ 0.1 ‘ ’ 1 ```
### mod5.r is stat diff from the other more simple models
Let's look at the output: ```R Linear mixed model fit by REML ['lmerMod'] Formula: inv_c ~ pretreat_Temp * Tmin + (1 + pretreat_Temp | Colony) Data: test
REML criterion at convergence: 525.8
Scaled residuals: Min 1Q Median 3Q Max -1.9347 -0.5625 -0.1789 0.4116 5.4326
Random effects: Groups Name Variance Std.Dev. Corr Colony (Intercept) 0.03646 0.1909 pretreat_Temp0 0.15330 0.3915 -0.23 pretreat_Temp25 0.20398 0.4516 -0.92 -0.13 pretreat_Temp5 0.26667 0.5164 -0.17 -0.50 0.50 Residual 0.32402 0.5692 Number of obs: 267, groups: Colony, 18
Fixed effects: Estimate Std. Error t value (Intercept) 3.59188 1.03188 3.481 pretreat_Temp0 -2.90454 1.68322 -1.726 pretreat_Temp25 -4.39184 1.80599 -2.432 pretreat_Temp5 -4.47550 1.96022 -2.283 Tmin 0.11598 0.06922 1.675 pretreat_Temp0:Tmin -0.23723 0.11283 -2.102 pretreat_Temp25:Tmin -0.28354 0.12088 -2.346 pretreat_Temp5:Tmin -0.30516 0.13104 -2.329
Correlation of Fixed Effects: (Intr) prt_T0 pr_T25 prt_T5 Tmin p_T0:T p_T25: pretrt_Tmp0 -0.519 prtrt_Tmp25 -0.770 0.183 pretrt_Tmp5 -0.443 -0.039 0.499 Tmin 0.997 -0.517 -0.766 -0.442 prtrt_Tm0:T -0.518 0.997 0.184 -0.037 -0.520 prtrt_T25:T -0.768 0.184 0.997 0.498 -0.770 0.185 prtrt_Tm5:T -0.443 -0.037 0.498 0.997 -0.445 -0.036 0.500 ```
### Considering only the random effect of colony
```R mod2<-lmer(formula=treatment_recovery_s.xpretreat_Temp+(1|Colony),REML=TRUE,data=test) mod3<-lmer(formula=treatment_recovery_s.xTmin+(1|Colony),REML=TRUE,data=test) mod4<-lmer(formula=treatment_recovery_s.xpretreat_Temp+Tmin+(1+pretreat_Temp|Colony),REML=TRUE,data=test) #mod5.r<-lmer(formula=inv_cpretreat_TempTmin+(1|Colony),REML=TRUE,data=test) mod6<-lmer(formula=treatment_recovery_s.x~pretreat_TempTmin+(1|Colony),REML=TRUE,data=test) anova(mod3,mod4,mod2,mod6)
mod3: treatment_recovery_s.x ~ Tmin + (1 | Colony) mod2: treatment_recovery_s.x ~ pretreat_Temp + (1 | Colony) mod4: treatment_recovery_s.x ~ pretreat_Temp + Tmin + (1 | Colony) mod6: treatment_recovery_s.x ~ pretreat_Temp * Tmin + (1 | Colony) Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq) mod3 4 3628.0 3642.4 -1810.0 3620.0 mod2 6 3583.1 3604.6 -1785.5 3571.1 48.9531 2 2.344e-11 mod4 7 3577.2 3602.3 -1781.6 3563.2 7.8337 1 0.005128 mod6 10 3564.4 3600.2 -1772.2 3544.4 18.8832 3 0.000289 **

###model output for mod6R Linear mixed model fit by REML ['lmerMod'] Formula: treatment_recovery_s.x ~ pretreat_Temp * Tmin + (1 | Colony) Data: test

REML criterion at convergence: 3649.7

Scaled residuals: Min 1Q Median 3Q Max -3.2557 -0.6656 -0.1116 0.4587 3.8248

Random effects: Groups Name Variance Std.Dev. Colony (Intercept) 752.9 27.44
Residual 33965.8 184.30
Number of obs: 280, groups: Colony, 19

Fixed effects: Estimate Std. Error t value (Intercept) 207.92 291.37 0.714 pretreat_Temp0 439.58 396.48 1.109 pretreat_Temp25 1736.31 395.31 4.392 pretreat_Temp5 1215.86 399.33 3.045 Tmin -24.52 19.57 -1.253 pretreat_Temp0:Tmin 39.34 26.65 1.476 pretreat_Temp25:Tmin 109.35 26.52 4.124 pretreat_Temp5:Tmin 79.62 26.73 2.979

Correlation of Fixed Effects: (Intr) prt_T0 pr_T25 prt_T5 Tmin p_T0:T p_T25: pretrt_Tmp0 -0.678
prtrt_Tmp25 -0.681 0.500
pretrt_Tmp5 -0.674 0.495 0.497
Tmin 0.997 -0.676 -0.679 -0.672
prtrt_Tm0:T -0.676 0.997 0.498 0.493 -0.678
prtrt_T25:T -0.680 0.499 0.997 0.496 -0.682 0.501
prtrt_Tm5:T -0.674 0.496 0.497 0.997 -0.677 0.497 0.499


------    
<div id='id-section57'/>
### Page 57: 2016-08-25. Hsp modulation follow up stats

summary(aov(log10(B_40)~axis3_desig,data=mergy)) Df Sum Sq Mean Sq F value Pr(>F)
axis3_desig 3 4.947 1.6490 7.154 0.000413 Residuals 52 11.986 0.2305
--- Signif. codes: 0 ‘
’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


I separated out groupings based on phylogenetic axes. The model anova is significant.  


Now I'll do a post hoc test. 

TukeyHSD(aov(log10(B_40)~axis3_desig,data=mergy))

diff        lwr        upr     p adj

North-A. picea 0.1185330 -0.3739818 0.6110478 0.9189644 South-A. picea -1.0921848 -1.7596714 -0.4246982 0.0003710 zAxis 2 A. picea-A. picea 0.2516912 -0.5104439 1.0138263 0.8169654 South-North -1.2107178 -1.9910398 -0.4303958 0.0007709 zAxis 2 A. picea-North 0.1331582 -0.7295202 0.9958366 0.9765503 zAxis 2 A. picea-South 1.3438760 0.3706435 2.3171085 0.0031663


------    
<div id='id-section58'/>
### Page 58: 2016-08-29 & 30. Climate cascade meeting  

1. Project updates:    
 * Gene expression project: on hold; focusing on 2 manuscripts (multiple stressors and range limits ms)      
 * Multiple stressors ms: working on SHC edits   
    * Send out Wednesday. 

 * Range limits ms: **Go over figure; SHC has ms**; eta? Not looked at it. 
    * sampling map: make larger, points should be gray; sites that were used for common garden should have a gold outline
    * fig 6, cold phys; get rid of "cold", use different words.  
 
 * Thermal niche ms: **Lacey and I working on discussion**     
 * HSP modulation paper: SHC submitted     
 * Stressed in nature MS: Samples to rerun.   
    * update: Curtis can no longer work+ write on project   
    * in reference to missing samples
    * Fit in time to process Curtis' samples. 

The DF 20140717 sample box was found when we dug through all the freezers in the winter and I didn't have time to extract RNA and qPCR them all. The HF 20140812 box was the box we weren't able to find anywhere. ```

There are 74 samples: 3 days of RNA isolation + cDNA synthesis. 4 gene targets ran in duplicates is 2 plates per gene = 8 plates total. 2 days for 8 plates.

  • Genome sequencing? Mlau's hands
  • Phylogenomics of common forest ants: status?

  • Attending SICB - Jan 4-8 New Orleans, Give a talk about range limits paper.
    • construct talk; when to give practice talk ?
  • Apply for funding. Suitor Travel Grant Deadline is october 31
    • Wrote up suiter award app. I need to find out pricing and then get everything signed.
  • Biolunch, working title: Strategies for achieving reproducible research Sept 2nd.

1. Project updates: * Gene expression project: on hold; focusing on 2 manuscripts (multiple stressors and range limits ms) * Multiple stressors ms: * SHC hands * Range limits ms: Aaron made comments, go over with Nick
* Thermal niche ms: Lacey and I working on discussion * Stressed in nature MS: Samples to rerun. * update: Curtis can no longer work+ write on project * in reference to missing samples * Fit in time to process Curtis' samples.
There are 74 samples: 3 days of RNA isolation + cDNA synthesis. 4 gene targets ran in duplicates is 2 plates per gene = 8 plates total. 2 days for 8 plates.
* Attending SICB - Jan 4-8 New Orleans, Give a talk about range limits paper. * construct talk; when to give practice talk ? * Apply for funding. Suitor Travel Grant Deadline is october 31 * Wrote up suiter award app. I need to find out pricing and then get everything signed.
Notes: Only NJG and ANBE in attendance.
* Go over thesis layout next time

Page 63: 2016-09-07. PCA update for range limit data ; see * Page 63: 2016-09-07. PCA update for range limit data

Aaron wants to explore PCA decomposition of bioclim variables

PCA of all bioclim

nm<-princomp(scale(dbio2[,17:35]))
knitr::kable(round(nm$loadings[,1:4],3))

Table of loadings

Comp.1 Comp.2 Comp.3 Comp.4
MAT 0.238 -0.242 0.191 -0.079
MDR -0.192 -0.307 -0.347 0.086
ISO 0.037 -0.309 -0.614 -0.515
SD -0.267 -0.124 0.000 0.393
Tmax 0.052 -0.451 0.099 0.239
Tmin 0.281 -0.026 0.184 -0.206
TAR -0.248 -0.211 -0.129 0.327
TWQ -0.205 0.213 0.151 -0.155
TDQ 0.259 0.111 0.034 0.002
TwarmQ 0.128 -0.389 0.247 0.209
TminQ 0.274 -0.112 0.140 -0.205
AP 0.258 0.103 -0.324 0.158
PWM 0.268 0.100 -0.230 0.275
PDM 0.259 -0.108 -0.046 0.164
PSD 0.052 0.413 -0.107 0.240
PWQ 0.256 0.180 -0.215 0.198
PDQ 0.259 -0.124 -0.122 0.075
PwarmQ -0.263 0.107 -0.228 -0.014
PminQ 0.282 0.065 -0.130 0.143

Screeplot of PCA of all bioclim vars

Variance explained

summary(nm)
Importance of components:
                          Comp.1    Comp.2     Comp.3     Comp.4     Comp.5
Standard deviation     3.4169139 2.0868333 1.00881816 0.72270248 0.71067369
Proportion of Variance 0.6205736 0.2314732 0.05409423 0.02776159 0.02684514
Cumulative Proportion  0.6205736 0.8520468 0.90614101 0.93390259 0.96074773

PC1 explains 62%, PC2 explains 23%, PC3 explains 5%.

Statistical analysis: Using logistic regression, glm() function for first 3 PCs

dmo1<-glm(dbio2$Found_Notfound~pca.clim[,1]+pca.clim[,2]+pca.clim[,3],family="binomial")
summary(dmo1)

 Call:
glm(formula = dbio2$Found_Notfound ~ pca.clim[, 1] + pca.clim[, 
    2] + pca.clim[, 3], family = "binomial")

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.6588  -0.9896   0.3712   0.9299   2.3119  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -0.11715    0.24828  -0.472  0.63702    
pca.clim[, 1]  0.23114    0.08479   2.726  0.00641 ** 
pca.clim[, 2] -0.57836    0.15037  -3.846  0.00012 ***
pca.clim[, 3] -0.19877    0.24715  -0.804  0.42126    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 141.36  on 101  degrees of freedom
Residual deviance: 112.62  on  98  degrees of freedom
AIC: 120.62

Number of Fisher Scoring iterations: 5

#more digestable table
knitr::kable(round(summary(dmo1)$coefficients,3))

Table output of logistic regression

Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.117 0.248 -0.472 0.637
pc1 0.231 0.085 2.726 ** 0.006| |pc2 | -0.578| 0.150| -3.846| 0.000**| |pc3 | -0.199| 0.247| -0.804| 0.421|

Overlaying presence-absence onto climate space as represented by PCs

Aaron's thoughts

Hi Andrew,
 
The scree plot suggests both PC1 and maybe PC2, not definitely not PC3 are useful. The GLM supports this.
 
The loadings on PC2 are clear: MDR, ISO, Tmax, TwarmQ, PSD, none of which load heavily on PC1
 
But the loadings on PC1 are a mess. None exceed 0.3 in loading, and the 0.2-0.3 (absolute values) are: MAT, SD, Tmin, TAR, TDQ, TminQ, AP, PWM, PDM, PDQ, PWarmQ, and PminQ.
 
Looks to me like a lot of min temps and precip on PC1 and maxima on PC2, but I don’t know my bioclim vars.
 
But the “bowing” on the biplot is a common problem when you have more than 1 env. gradient working in the data that are working at cross-purposes. Which you described in text, and which you get out of the regression (or classification) tree (which I did get backwards – it’s about the predictee, not the predictors, but not both).
 
So my suggestion would be to stick with the CART analysis. If you must do a GLM, you should only work with uncorrelated BioClim vars. You’ll just have to choose the set a priori and defend it.
 
Best,
Aaron